scRNA-seq transcriptome processing was performed on the 10x Genomics Chromium system and involved GEM generation, post-GEM-generation clean-up, cDNA amplification, and DNA quantification. The library was sequenced on the Illumina NovaSeq platform.
The Chromium Single Cell Reagent Kits (10X SC RNA 5pr, 10X SC VDJ TCR Chemistry) provide a scalable microfluidic platform for digital gene expression (GEX) and V(D)J profiling of 500-10,000 individual cells per sample. A pool of ~3,500,000 10x Barcodes was sampled separately to index each cell’s transcriptome. This is done by partitioning thousands of cells into nanoliter-scale Gel Beads-in-emulsion (GEMs), where all cDNA generated within a GEM shares a common 10x Barcode. Libraries were generated from the cDNAs and sequenced, and the 10x Barcodes were used to associate individual reads back to their individual partitions.
The Cell Ranger analysis pipelines process Chromium single-cell data to align reads, generate feature-barcode matrices, and perform clustering and other secondary analyses. cellranger mkfastq, which wraps Illumina’s bcl2fastq, demultiplexes the raw base call (BCL) files generated by Illumina sequencers into FASTQ files.
cellranger count takes the FASTQ files from cellranger mkfastq and performs alignment, filtering, barcode counting, and UMI counting. It uses the Chromium cellular barcodes to generate feature-barcode matrices, determine clusters, and perform gene expression analysis. In the count pipeline, one sample is processed through one GEM well and sequenced on one flowcell. In this case, FASTQs are generated with cellranger mkfastq and cellranger count is run as a single-sample analysis. Additionally, the pipelines perform secondary analyses such as dimensionality reduction, clustering, and differential analysis.
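The UMI counting step can be illustrated with a minimal sketch. It assumes reads have already been aligned and reduced to (barcode, UMI, gene) records. The function name `count_umis` and the toy records are hypothetical, and the sketch omits the barcode and UMI error correction and the filtering that cellranger count performs.

```python
from collections import defaultdict


def count_umis(reads):
    """Build a sparse feature-barcode matrix from (barcode, umi, gene) records.

    Each distinct UMI observed for a (gene, barcode) pair is counted once,
    so PCR duplicates of the same molecule do not inflate the count.
    """
    seen = defaultdict(set)  # (gene, barcode) -> set of distinct UMIs
    for barcode, umi, gene in reads:
        seen[(gene, barcode)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Toy input (hypothetical barcodes and UMIs, for illustration only).
reads = [
    ("AAAC", "UMI1", "CD3E"),
    ("AAAC", "UMI1", "CD3E"),  # duplicate read of the same molecule
    ("AAAC", "UMI2", "CD3E"),
    ("AAAC", "UMI3", "CD8A"),
    ("TTTG", "UMI1", "CD3E"),
]
matrix = count_umis(reads)
```

Each key of `matrix` is a (gene, barcode) pair and each value is the number of distinct molecules observed, which corresponds to one non-zero entry of a feature-barcode matrix.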
In the grouped count pipeline, one sample is processed through one GEM well, resulting in one library which is sequenced across multiple flowcells. This workflow is commonly performed to increase sequencing depth. In this case, all reads can be combined in a single instance of the cellranger count pipeline. Finally, the pipelines perform secondary analyses like dimensionality reduction, clustering, and differential analysis.
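As a hedged sketch of how reads from several flowcells might be combined in one run, the helper below assembles a cellranger count command in which the FASTQ directories are passed to `--fastqs` as a comma-separated list. The helper name `build_count_command`, the run ID, sample name, and all paths are hypothetical placeholders. Depending on the Cell Ranger version, further options may be required. The command is only constructed here, not executed.

```python
def build_count_command(run_id, transcriptome, fastq_dirs, sample):
    """Return the argument list for one cellranger count run over several FASTQ folders."""
    return [
        "cellranger", "count",
        f"--id={run_id}",
        f"--transcriptome={transcriptome}",
        # One library sequenced on several flowcells: list every FASTQ folder.
        f"--fastqs={','.join(fastq_dirs)}",
        f"--sample={sample}",
    ]


# Hypothetical identifiers and paths, for illustration only.
cmd = build_count_command(
    run_id="sample1_grouped",
    transcriptome="/refs/transcriptome",
    fastq_dirs=["/data/flowcell_A/fastq", "/data/flowcell_B/fastq"],
    sample="sample1",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where Cell Ranger is installed.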
To generate single-cell V(D)J sequences and annotations for a single library, run cellranger vdj with the following parameters.
Metadata is created at multiple points throughout the pipeline. This document includes information about the size of the batch load and resources that can be used to conduct QC.
Quality control for 10x Genomics single-cell RNA-seq data: unique gene counts per cell, separated by sample.
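The unique gene count per cell is the number of genes with at least one count in that cell. The sketch below computes it from a small dense matrix with genes as rows and cells as columns, then groups the values by sample. The function names, the toy matrix, and the sample labels are illustrative assumptions, not output of the pipeline.

```python
def genes_per_cell(counts):
    """Number of genes with a non-zero count in each cell (matrix columns)."""
    n_cells = len(counts[0])
    return [sum(1 for row in counts if row[j] > 0) for j in range(n_cells)]


def group_by_sample(values, samples):
    """Collect the per-cell values under the sample label of each cell."""
    grouped = {}
    for value, sample in zip(values, samples):
        grouped.setdefault(sample, []).append(value)
    return grouped


# Toy matrix: 3 genes (rows) x 3 cells (columns); labels are hypothetical.
counts = [
    [0, 2, 1],
    [3, 0, 0],
    [1, 1, 0],
]
samples = ["S1", "S1", "S2"]

n_genes = genes_per_cell(counts)
by_sample = group_by_sample(n_genes, samples)
```

The per-sample lists in `by_sample` are the values that a violin or box plot of unique gene counts would display.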
Gene Expression profile
Gene Expression profiles - grouped
Single Cell Immune Profiling
We used the area under the curve (AUC) and the bimodality of the score distributions to separate the distributions and to evaluate how strongly the genes of each reference cell are enriched in a given cell. These results can be used to select an optimal number of clusters objectively and to discriminate between true variation and background noise.
Samples
Uniform Manifold Approximation and Projection (UMAP) projection of transcriptionally and functionally distinct clusters, highlighted by cell type group. UMAP constructs a high-dimensional graph representation of the data, then builds a low-dimensional graph that is as structurally similar as possible.
GML: Grouped Multiple Libraries per sample
Uniform Manifold Approximation and Projection (UMAP) projection of transcriptionally and functionally distinct clusters, highlighted by cell type group. UMAP constructs a high-dimensional graph representation of the data, then builds a low-dimensional graph that is as structurally similar as possible.
Samples
t-distributed stochastic neighbor embedding (t-SNE) projection of transcriptionally and functionally distinct clusters, highlighted by cell type group.
GML: Grouped Multiple Libraries per sample
t-distributed stochastic neighbor embedding (t-SNE) projection of transcriptionally and functionally distinct clusters, highlighted by cell type group.
Samples
Heatmap of single cells showing the most strongly enriched genes in each cell type group.
GML: Grouped Multiple Libraries per sample
Heatmap of single cells showing the most strongly enriched genes in each cell type group.
Samples
The Barcode Rank Plot shows the distribution of non-duplicate reads with mapping quality at least 30 per barcode and which barcodes were inferred to be associated with cells. The y-axis shows the value that cellranger-dna uses to call cells and the x-axis is the number of barcodes below that value.
GML: Grouped Multiple Libraries per sample
The Barcode Rank Plot shows the distribution of non-duplicate reads with mapping quality at least 30 per barcode and which barcodes were inferred to be associated with cells. The y-axis shows the value that cellranger-dna uses to call cells and the x-axis is the number of barcodes below that value. The Barcode Rank Plot is equivalent to a transposed empirical cumulative density plot with log-transformed axes.
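The curve behind a barcode rank plot can be sketched by sorting the per-barcode values in descending order and pairing each value with its rank. The function name and the toy counts below are assumptions for illustration. Cell calling and the log-scaled axes of the actual plot are not reproduced here.

```python
def barcode_rank_curve(counts_per_barcode):
    """Return (rank, count) pairs with barcodes ordered from highest to lowest count."""
    ordered = sorted(counts_per_barcode, reverse=True)
    return list(enumerate(ordered, start=1))


# Toy per-barcode counts (hypothetical values).
curve = barcode_rank_curve([5, 100, 1, 50])
```

Plotting the rank on the x-axis against the count on the y-axis, both on logarithmic scales, gives the familiar shape in which cell-associated barcodes sit above a steep drop.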
The analysis, gene count matrix, and reports have been copied to the “additional_analyses” folder on the FTP server (LINK).